An Improved Proximal Policy Optimization Method for Low-Level Control of a Quadrotor

Abstract

In this paper, a novel deep reinforcement learning algorithm based on Proximal Policy Optimization (PPO) is proposed to achieve fixed-point flight control of a quadrotor. The attitude and position information of the quadrotor is directly mapped to the PWM signals of the four rotors through neural network control. To constrain the size of policy updates, a PPO algorithm based on Monte Carlo approximations is proposed to obtain the optimal penalty coefficient. A policy optimization method with a penalized point probability distance can provide policy diversity at each policy update. The new proxy objective function is introduced into the actor-critic network, which solves the problem of PPO falling into local optima. Moreover, a compound reward function is presented to accelerate the gradient algorithm along the policy update direction by analyzing the various states that the quadrotor may encounter in flight, which improves the learning efficiency of the network. The simulation tests the generalization ability of the offline policy by changing the wing length and payload of the quadrotor. Compared with the PPO method, the proposed method has higher learning efficiency and better robustness.
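
The penalized probability-distance term mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the formula, so the squared difference between new and old action probabilities used here is an assumption, as are the function name `penalized_surrogate` and the fixed coefficient `beta` (the paper reports estimating its penalty coefficient by Monte Carlo approximation, which is not reproduced here).

```python
import math

def penalized_surrogate(new_logp, old_logp, advantages, beta):
    """Penalty-form policy objective over a batch of sampled actions.

    Each sample contributes ratio * advantage - beta * distance, where
    distance is the squared gap between the new and old action
    probabilities (an assumed form, not taken from the paper).
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        p_new = math.exp(lp_new)
        p_old = math.exp(lp_old)
        ratio = p_new / p_old              # importance-sampling ratio
        distance = (p_new - p_old) ** 2    # assumed probability distance
        total += ratio * adv - beta * distance
    return total / len(advantages)
```

When the new policy equals the old one, the penalty vanishes and the objective reduces to the mean advantage. A larger `beta` discourages updates that move the action probabilities far from the old policy.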

Similar resources

Development and Implementation of an Optimized Control Strategy for Induction Machine in an Electric Vehicle

In the area of automotive engineering there is a tendency toward greater electrification of the power train. In this work, control of an induction machine for electric vehicle applications is investigated. As the operating point of the machine changes, adapting the rotor magnetization current seems to be useful for increasing the machine's efficiency. In the literature there are many approaches wh...

Robust Control of a Quadrotor

In this paper, a robust tracking control method for automatic take-off and trajectory tracking of a quadrotor helicopter is presented. The designed controller includes two parts: a position controller and an attitude controller. The attitude controller is designed by using the sliding mode control (SMC) method to track the desired pitch and roll angles, which are the output of position controll...

Solution of Security Constrained Unit Commitment Problem by a New Multi-Objective Optimization Method

Abstract: Optimal power flow (OPF), one of the foundational tools for analyzing complex power systems, has been studied for a long time. OPF optimizes the objective functions of a power system, such as fuel cost, emissions, and losses, while simultaneously satisfying the power system constraints. In its most general form, OPF is a nonlinear, non-convex, large-scale, static optimization problem that can include continuous and ...

Improved Optimization Process for Nonlinear Model Predictive Control of PMSM

Model-based predictive control (MPC) is one of the most efficient techniques and is widely used in industrial applications. In such controllers, increasing the prediction horizon results in better selection of the optimal control signal sequence. On the other hand, increasing the prediction horizon increases the computational time of the optimization process, which makes it impossible to be imple...

Proximal Policy Optimization Algorithms

We propose a new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a “surrogate” objective function using stochastic gradient ascent. Whereas standard policy gradient methods perform one gradient update per data sample, we propose a novel objective function that enables multiple epochs of ...
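
As a rough illustration of the "surrogate" objective named in this teaser, the sketch below computes the clipped form described in the full text of the PPO paper; the truncated teaser itself does not show it. The function name `clipped_surrogate` is hypothetical, and `epsilon=0.2` is only a commonly used default, assumed here.

```python
import math

def clipped_surrogate(new_logp, old_logp, advantages, epsilon=0.2):
    """Clipped PPO surrogate averaged over a batch of sampled actions."""
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)  # pi_new(a|s) / pi_old(a|s)
        clipped = min(max(ratio, 1.0 - epsilon), 1.0 + epsilon)
        # Taking the minimum keeps the pessimistic estimate, so moving the
        # ratio outside [1 - epsilon, 1 + epsilon] cannot raise the objective.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

Because the objective stops rewarding large ratio changes, the same batch of samples can be reused for several epochs of gradient ascent.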

Journal

Journal title: Actuators

Year: 2022

ISSN: 2076-0825

DOI: https://doi.org/10.3390/act11040105